[1.3] bump cgroups to v0.0.4 + related test fixes#4897
[1.3] bump cgroups to v0.0.4 + related test fixes#4897cyphar merged 5 commits intoopencontainers:release-1.3from
Conversation
Instead of providing systemd CPU quota value (CPUQuotaPerSec), calculate it based on how opencontainers/cgroups/systemd handles it (see addCPUQuota). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
For changes, see https://github.com/opencontainers/cgroups/releases/tag/v0.0.2 Fix integration tests according to changes in [1] (now the CPU quota value set is rounded the same way systemd does it). [1]: opencontainers/cgroups#4 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since opencontainers/cgroups v0.0.2 (commit b206a01), all stuct Resources fields are annotated with "omitempty" attribute. As a result, the loaded configuration may have Resources == nil. It is totally OK (rootless containers may have no resources configured) except since commit 6c5441e, cgroup v1 fs manager requires Resources to be set in the call to NewManager (this is a cgroup v1 deficiency, or maybe our implementation deficiency, or both). To work around this, let's add code to ensure Resources is never nil after loading from state.json. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
For changelog, see https://github.com/opencontainers/cgroups/releases/tag/v0.0.3 This fixes two runc issues: 1. JSON incompatibility introduced in cgroups v0.0.2 (see opencontainers/cgroups#22). 2. Bad CPU shares to CPU weight conversion (see opencontainers#4772). Due to item 2, modify some tests accordingly. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
|
@jaredledvina Thanks! I'm fine with the idea of a backport if it fixes a bug in prod. I haven't been following that issue, though, can you comment briefly what issue do you hit in prod? |
|
👋 Hey @rata, sure thing! In short, after migrating to cgroup v2, containers that are created with a low cpu request values will be given a significantly lower relative cpu weight. As a specific example, we run 1 container that is configured to get 20 mcores. After migrating to cgroup v2, the determined cpu weight is equivalent to if this container requested 0. The updated formula in opencontainers/cgroups#20 improves the granularity of the allocated cpu weights for containers in this sort of situation. kubernetes/kubernetes#131216 (comment) has some nice vizuals of the algorithm improvement. |
|
Cool, thanks! @kolyshkin any thoughts on this? |
|
I just want to point out that I didn't really plan to do a 1.3.2 release for another month (meaning this won't be much faster than waiting for 1.4.0), but I could be convinced to cut a release earlier if necessary... |
|
Ah okay, do you all happen to know when containerd opt's to bump to new minor releases of runc? I'm working on cutting a release off our fork for now to unblock the immediate issue regardless. |
I'm all for it. In addition, I was looking into whether we should port cgroups v0.0.4, too (which fixes the annoying bug #4798). The only other change in v0.0.4 which is of interest here is opencontainers/cgroups#24, and it looks like it can't bring other regressions. So maybe we should shove v0.0.4 here, too; @jaredledvina can you please cherry-pick #4808 as well? |
Bumps [github.com/opencontainers/cgroups](https://github.com/opencontainers/cgroups) from 0.0.3 to 0.0.4. - [Release notes](https://github.com/opencontainers/cgroups/releases) - [Changelog](https://github.com/opencontainers/cgroups/blob/main/RELEASES.md) - [Commits](opencontainers/cgroups@v0.0.3...v0.0.4) --- updated-dependencies: - dependency-name: github.com/opencontainers/cgroups dependency-version: 0.0.4 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>
|
@kolyshkin - Thanks for taking a look! I've cherry-picked fc8162e here too. |
kolyshkin
left a comment
There was a problem hiding this comment.
LGTM assuming tests are green
|
I've updated the title and description to better reflect the current state of this PR, and also added labels and a milestone. |
|
Actually, I might as well just prepare a release with this next week. |
|
I greatly appreciate all the help, thank you! |
|
@cyphar - I'm not sure if you all have a better place for me to ask but, I just wanted to check and see if you planned on cutting 1.3.2 this week. |
This is a backport of
to the 1.3 release branch.
I'm looking to get the fix for #4772 back on 1.3 (we're currently running 1.3.1) instead of waiting until the end of October for 1.4 to be released per https://github.com/opencontainers/runc/blob/main/RELEASES.md. We have noticed the side-effects of this issue in production while migrating to cgroup v2.